Robust supervised learning with coordinate gradient descent
Authors
Abstract
This paper considers the problem of supervised learning with linear methods when both features and labels can be corrupted, either in the form of heavy-tailed data and/or corrupted rows. We introduce a combination of coordinate gradient descent as a learning algorithm together with robust estimators of the partial derivatives. This leads to robust statistical learning methods that have a numerical complexity nearly identical to that of non-robust ones based on empirical risk minimization. The main idea is simple: while robust learning with gradient descent requires the computational cost of robustly estimating the whole gradient to update all parameters, a parameter can be updated immediately using a robust estimator of a single partial derivative in coordinate gradient descent. We prove upper bounds on the generalization error of the algorithms derived from this idea, which control the optimization error even without a strong convexity assumption on the risk. Finally, we propose an efficient implementation of this approach in a new Python library called linlearn, and demonstrate through extensive experiments that our approach introduces an interesting compromise between robustness, statistical performance and numerical efficiency for this problem.
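To make the idea concrete, here is a minimal numpy sketch of robust coordinate gradient descent for least squares, using a median-of-means estimator of each partial derivative. This is an illustrative sketch only, not the linlearn implementation: the choice of loss and estimator, and all function and parameter names (median_of_means, robust_cgd, n_blocks, lr), are assumptions made for this example.

```python
import numpy as np

def median_of_means(values, n_blocks):
    # Split per-sample contributions into blocks, average each block,
    # and return the median of the block means (robust to outliers).
    blocks = np.array_split(values, n_blocks)
    return np.median([block.mean() for block in blocks])

def robust_cgd(X, y, n_epochs=50, lr=0.01, n_blocks=10):
    # Coordinate gradient descent for the loss (1/2n) * ||X w - y||^2:
    # each coordinate is updated immediately with a robust estimate of a
    # single partial derivative, instead of robustly estimating the whole
    # gradient before any parameter moves.
    n, d = X.shape
    w = np.zeros(d)
    residual = X @ w - y                 # kept in sync with w below
    for _ in range(n_epochs):
        for j in range(d):
            # Per-sample contributions to the j-th partial derivative.
            partials = residual * X[:, j]
            g_j = median_of_means(partials, n_blocks)
            step = lr * g_j
            w[j] -= step
            residual -= step * X[:, j]   # incremental residual update
    return w
```

Note how the residual is maintained incrementally: one full pass over the coordinates costs about the same as a pass of plain coordinate descent plus the per-coordinate robust-estimation overhead, which is what keeps the numerical complexity close to the non-robust baseline.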
Related papers
Robust Block Coordinate Descent
In this paper we present a novel randomized block coordinate descent method for the minimization of a convex composite objective function. The method uses (approximate) partial second-order (curvature) information, so that the algorithm performance is more robust when applied to highly nonseparable or ill-conditioned problems. We call the method Robust Coordinate Descent (RCD). At each iteratio...
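A minimal sketch of the kind of curvature-preconditioned block update this snippet describes, specialized to least squares for concreteness (the paper treats general convex composite objectives; the function and parameter names here are illustrative, not RCD's actual interface):

```python
import numpy as np

def block_cd_curvature(X, y, block_size=5, n_epochs=30, damping=1e-6, seed=0):
    # Randomized block coordinate descent for least squares where each
    # block step is preconditioned with that block's (approximate)
    # second-order curvature H_B = X_B^T X_B / n, so badly scaled or
    # strongly coupled coordinates within a block are handled together.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    blocks = [np.arange(s, min(s + block_size, d))
              for s in range(0, d, block_size)]
    H = [X[:, B].T @ X[:, B] / n + damping * np.eye(len(B)) for B in blocks]
    for _ in range(n_epochs):
        for k in rng.permutation(len(blocks)):
            B = blocks[k]
            g = X[:, B].T @ (X @ w - y) / n   # gradient restricted to block B
            w[B] -= np.linalg.solve(H[k], g)  # curvature-preconditioned step
    return w
```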
Learning Structured Classifiers with Dual Coordinate Descent
We present a unified framework for online learning of structured classifiers. This framework handles a wide family of convex loss functions that includes as particular cases CRFs, structured SVMs, and the structured perceptron. We introduce a new aggressive online algorithm that optimizes any loss in this family; for the structured hinge loss, this algorithm reduces to 1-best MIRA; in general, ...
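For the structured hinge loss, the aggressive update mentioned in the snippet reduces to 1-best MIRA, whose closed-form step can be sketched as follows (mira_update is a hypothetical helper, assuming feature vectors for the gold and predicted outputs have already been computed):

```python
import numpy as np

def mira_update(w, feats_gold, feats_pred, cost, C=1.0):
    # One passive-aggressive (1-best MIRA) step: take the smallest weight
    # change that fixes the current margin violation, capped at C.
    delta = feats_gold - feats_pred         # feature difference
    loss = max(0.0, cost - w @ delta)       # cost-augmented hinge loss
    if loss == 0.0:
        return w                            # margin already satisfied
    tau = min(C, loss / (delta @ delta + 1e-12))
    return w + tau * delta

# Example: one update on toy feature vectors.
w = mira_update(np.zeros(4),
                np.array([1., 0., 1., 0.]),   # features of the gold output
                np.array([0., 1., 0., 1.]),   # features of the prediction
                cost=1.0)
```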
Learning Output Kernels with Block Coordinate Descent
We propose a method to learn simultaneously a vector-valued function and a kernel between its components. The obtained kernel can be used both to improve learning performance and to reveal structures in the output space which may be important in their own right. Our method is based on the solution of a suitable regularization problem over a reproducing kernel Hilbert space of vector-valued func...
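A rough sketch of such an alternating (block coordinate) scheme for a separable vector-valued kernel model: given the output kernel L, the coefficients C of the vector-valued function solve a Sylvester equation; given C, the output kernel is refreshed from the positive semidefinite matrix C^T K C. The objective, the normalized L-refresh, and all names here are assumptions for illustration, not the update rules from the paper:

```python
import numpy as np

def fit_output_kernel(K, Y, lam=0.1, n_iters=20):
    # K: (n, n) input Gram matrix; Y: (n, m) multi-output targets.
    # Alternates between (i) solving K C L + lam * C = Y for C via
    # eigendecompositions of K and L, and (ii) setting L proportional to
    # C^T K C, which is PSD by construction (a simplified heuristic).
    n, m = Y.shape
    L = np.eye(m)
    s, U = np.linalg.eigh(K)
    for _ in range(n_iters):
        t, V = np.linalg.eigh(L)
        Yt = U.T @ Y @ V
        Ct = Yt / (np.outer(s, t) + lam)   # elementwise Sylvester solve
        C = U @ Ct @ V.T
        M = C.T @ K @ C                    # PSD candidate output kernel
        L = m * M / max(np.trace(M), 1e-12)
    return C, L
```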
Learning to learn by gradient descent by gradient descent
The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way. Our learned algorit...
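The paper itself trains an LSTM optimizer by backpropagating through the unrolled optimization; the toy sketch below keeps only the core idea in self-contained numpy, with an assumed two-parameter update rule meta-trained by finite differences on random quadratics:

```python
import numpy as np

def unrolled_loss(theta, rng, T=20, dim=5):
    # Meta-objective: run T steps of the parameterized update rule
    #   w <- w - theta[0] * g - theta[1] * m   (m is a momentum buffer)
    # on a random quadratic and return the final loss.
    A = rng.normal(size=(dim, dim))
    A = A.T @ A + np.eye(dim)              # random well-posed quadratic
    b = rng.normal(size=dim)
    w = np.zeros(dim)
    m = np.zeros(dim)
    for _ in range(T):
        g = A @ w - b
        m = 0.9 * m + g
        w = w - theta[0] * g - theta[1] * m
    return 0.5 * w @ A @ w - b @ w

def meta_train(n_meta=200, eps=1e-3, meta_lr=0.005, seed=0):
    # Optimizer design cast as a learning problem: descend the unrolled
    # loss via a central finite-difference meta-gradient on random tasks.
    rng = np.random.default_rng(seed)
    theta = np.array([0.05, 0.0])
    for _ in range(n_meta):
        task_seed = int(rng.integers(1 << 30))
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            e = np.zeros_like(theta)
            e[i] = eps
            # Paired seeds: both evaluations see the same random task.
            lp = unrolled_loss(theta + e, np.random.default_rng(task_seed))
            lm = unrolled_loss(theta - e, np.random.default_rng(task_seed))
            grad[i] = (lp - lm) / (2 * eps)
        theta -= meta_lr * np.clip(grad, -5.0, 5.0)
    return theta
```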
Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent
Semi-supervised learning tries to employ a large collection of unlabeled data and a few labeled examples for improving generalization performance, which has proved meaningful in real-world applications. The bottleneck of existing semi-supervised approaches lies in the overly long training time due to the large-scale unlabeled data. In this article we introduce a novel method for semi-supervised l...
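One common way to set this up, sketched below under assumptions (the article's exact formulation may differ): SGD on a hinge loss over labeled points plus a symmetric hinge max(0, 1 - |w·x|) on unlabeled points, a standard transductive surrogate that pushes the decision boundary away from unlabeled data. All names and parameters here are illustrative.

```python
import numpy as np

def s3vm_sgd(X_lab, y_lab, X_unl, lam=1e-3, gamma=0.1,
             n_epochs=10, lr0=0.1, seed=0):
    # X_lab: labeled features, y_lab: labels in {-1, +1},
    # X_unl: unlabeled features, gamma: weight of the unlabeled term.
    rng = np.random.default_rng(seed)
    w = np.zeros(X_lab.shape[1])
    data = [(x, y) for x, y in zip(X_lab, y_lab)] + \
           [(x, None) for x in X_unl]
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(data)):
            x, y = data[i]
            t += 1
            lr = lr0 / (1 + lr0 * lam * t)   # decaying step size
            w *= (1 - lr * lam)              # L2 regularization shrinkage
            if y is not None:
                if y * (w @ x) < 1:          # labeled hinge subgradient
                    w += lr * y * x
            else:
                s = w @ x
                if abs(s) < 1:               # symmetric hinge on unlabeled x
                    w += lr * gamma * np.sign(s) * x
    return w
```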
Journal
Journal title: Statistics and Computing
Year: 2023
ISSN: 0960-3174, 1573-1375
DOI: https://doi.org/10.1007/s11222-023-10283-7